Advanced record linkage methods and privacy aspects for population reconstruction
Abstract
Recent years have seen increased interest in techniques that allow the linking of records across databases. The main challenges of record linkage are (1) scalability to the increasingly large databases common today; (2) accurate and efficient classification of compared records into matches and non-matches in the presence of variations and errors in the data; and (3) privacy issues that arise when the linking of records is based on sensitive personal information about individuals. The first challenge has been addressed by the development of scalable indexing techniques, the second by advanced classification techniques that employ either machine learning or graph-based methods, and the third is being investigated through research into privacy-preserving record linkage. In this paper, we describe these major challenges of record linkage in the context of population reconstruction, outline recent developments in advanced record linkage methods, and provide directions for future research.
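To make the first two challenges concrete, the following is a minimal sketch of how indexing (here, simple blocking) and classification (here, a fixed similarity threshold) fit together. It is an illustration under stated assumptions, not the method of the paper: the field names, the blocking key, the use of `difflib.SequenceMatcher` as the string comparator, the threshold of 0.75, and the sample records are all hypothetical. Privacy-preserving linkage is not covered by this sketch.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def blocking_key(record):
    # Hypothetical blocking key: surname initial plus birth year.
    # Only records that share this key are ever compared.
    return (record["surname"][:1].lower(), record["birth_year"])


def build_index(records):
    # Group records by blocking key so that candidate pairs can be
    # generated without comparing every record with every other record.
    index = defaultdict(list)
    for rec in records:
        index[blocking_key(rec)].append(rec)
    return index


def similarity(a, b):
    # Approximate string similarity in [0, 1]; tolerant of spelling variations.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def link(db_a, db_b, threshold=0.75):
    # Classify candidate pairs as matches when the average name similarity
    # reaches the (assumed) threshold; all other pairs are non-matches.
    index_b = build_index(db_b)
    matches = []
    for rec_a in db_a:
        for rec_b in index_b.get(blocking_key(rec_a), []):
            score = (
                similarity(rec_a["first_name"], rec_b["first_name"])
                + similarity(rec_a["surname"], rec_b["surname"])
            ) / 2
            if score >= threshold:
                matches.append((rec_a["id"], rec_b["id"], score))
    return matches


# Made-up sample records, for illustration only.
db_a = [
    {"id": "a1", "first_name": "Catherine", "surname": "Smith", "birth_year": 1871},
    {"id": "a2", "first_name": "John", "surname": "Miller", "birth_year": 1880},
]
db_b = [
    {"id": "b1", "first_name": "Katherine", "surname": "Smyth", "birth_year": 1871},
    {"id": "b2", "first_name": "Mary", "surname": "Stone", "birth_year": 1871},
    {"id": "b3", "first_name": "John", "surname": "Miller", "birth_year": 1902},
]

matches = link(db_a, db_b)
for id_a, id_b, score in matches:
    print(id_a, id_b, round(score, 3))
```

The blocking step trades recall for speed: a pair whose blocking keys differ (for example, because of a wrongly recorded birth year) is never compared, which is one reason more robust indexing and classification techniques are an active research topic.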
Related resources
Advanced Record Linkage Methods and Privacy Aspects for Population Reconstruction - A Survey and Case Studies
Application of Advanced Record Linkage Techniques for Complex Population Reconstruction
Record linkage is the process of identifying records that refer to the same entities from several databases. This process is challenging because commonly no unique entity identifiers are available. Linkage therefore has to rely on partially identifying attributes, such as names and addresses of people. Recent years have seen the development of novel techniques for linking data from diverse appl...
Privacy preserving interactive record linkage (PPIRL)
OBJECTIVE Record linkage to integrate uncoordinated databases is critical in biomedical research using Big Data. Balancing privacy protection against the need for high quality record linkage requires a human-machine hybrid system to safely manage uncertainty in the ever changing streams of chaotic Big Data. METHODS In the computer science literature, private record linkage is the most publish...
Evaluation of advanced techniques for multi-party privacy-preserving record linkage on real-world health databases
The linking of multiple (three or more) health databases is challenging because of the increasing sizes of databases, the number of parties among which they are to be linked, and privacy concerns related to the use of personal data such as names, addresses, or dates of birth. This entails a need to develop advanced scalable techniques for linking multiple databases while preserving the privacy ...
Scaling Private Record Linkage using Output Constrained Differential Privacy
Many scenarios require computing the join of databases held by two or more parties that do not trust one another. Private record linkage is a cryptographic tool that allows such a join to be computed without leaking any information about records that do not participate in the join output. However, such strong security comes with a cost: except for exact equi-joins, these techniques have a high ...